Python basics

language
Python is one of the most popular programming languages.

Basics

Using Python

In console:

  1. Use Python in console / terminal: Python

  2. Type your code

  3. Quit Python in console: quit()

Run Python scripts In console / terminal:

Py myScript.py

Python myScript.py

Get help with functions and features
Help menu for python help()
Help section on function help("func") or help(package.func)
Functions in library dir("package")

File system, Import, Input, Output

import package: import os, shutil

Use file system
print working directory os.getcwd()
Change working directory os.chdir("path/to/dir")
List files in working directory os.listdir()
Create directory os.mkdir("dir_name")
remove directory os.rmdir("dir_name")
Create file with open("./filename", "w"): pass
Get info on file (size, time of creation) os.path.isfile("filename")
Rename file os.rename()
Copy file shutil.copy("source_filename", "dest_filename")
Construct file path from directory list os.path.join("path", "to", "file")
Importing other scripts
import path.to.otherScript
Executes the contents of the script.
Print to standard output
print(f"my Output includes a {variable_1} and {variable_2}.")

Logging

The built-in library is logging, however loguru is easier to use: pip install loguru.

Send log-messages to log-file
from loguru import logger
logger.add("logfile.log")
logger.debug("blablabla")
logger.info("foo bar bash")

Operations, numbers, vectors, matrices

Assign values to a variable
Assign value to variable x = 4.5
Assign to multiple variables x = y = z = 4.5
Create a list
x = [1,2,3]

Matrices & vectors

import package: import numpy as np import scipy as sp

Create array / vector
x_arr = array([1,2,3])
Elementwise adding, subtracting, dividing, multiplying, … vectors
x_arr = np.array([1,2,3])
y_arr = np.array([4,5,6])
z_arr = x_arr * y_arr # or -, *, /
Common operators on vectors
maximum, minimum max(x) min(x)
Number of rows & columns x_arr.shape
Sum of the elements x_arr.sum()
Product of the elements x_arr.prod()
Mean of the elements x_arr.mean()
Variance of the elements x_arr.var()
Sort elements ascending np.sort(x_arr)
Sort elements descending np.sort(x_arr)[::-1]
Matrix multiplication np.matmul(x_arr, y_arr)
Dimension of matrix x_arr.shape
mode (highest count of val) sp.stats.mode(x_arr)
Percentile np.percentile(x_arr, 50)
Generate sequences
x_arr = np.arange(1,11,1) # Integers from 1 to 10
# Or
x_arr = np.arange(1,11,0.5) # 1.0, 1.5, 2.0, 2.5, ... 

Repeat vector:

np.tile(x_arr, reps=2) # 1, 2, 3, 1, 2, 3

Repeat elements in vector:

np.repeat(x_arr, repeats=2) # 1, 1, 2, 2, 3, 3

Selecting elements in vectors

Select first element in sequence
x_arr[0] # ! not x_arr[1] !
Selecting first 10 elements in vector
x_arr[0:10] # ! not x_arr[0:9]
Selecting non-missing elements in vector
x_arr[~np.isnan(x_arr)]
Append element
np.append(x_arr, values=[1,2,3])
Insert element
np.insert(x_arr, obj=2, values=[1,2,3]) # obj=index at which to insert
Delete element
np.delete(x_arr, obj=-1) # deletes last element
Create matrix/2D-array
x_arr = np.array([(1,2,3),(4,5,6)]) 
# [[1 2 3]
#  [4 5 6]]
Access matrix element
x_arr[1,1]
Access matrix column(s)
x_arr[:,2] # 3rd column
x_arr[:, 0:2] # 1st & 2nd column
Add rows and columns
np.vstack([x_arr, y_arr]) # add other array as rows
np.hstack([x_arr, y_arr]) # add other array as columns
Boolean operations
create boolean vector x_arr < 3 # [True, True, False]
Boolean operators <, <=, >, >=, ==, !=
and cond1 and cond2
or cond1 or cond2
not not cond
element in vector? x in [2,3,4]
identical np.array_equal(x_arr, y_arr, equal_nan=True)

If logical vectors are used in arithmetic operations, False becomes 0, True becomes 1.

Missing values
nan !: Operations with missing values return missing values. (nan + 1 is still nan)

Checking for missing values: np.isnan(x) (NaN = Not a number)

Assign value only to elements where condition is true
x_arr[np.isnan(x_arr)] = 4
Characters
Character string "..."
Escape character \
New line \n
Tab \t
length of string len(str)
Is seq of chars in string? "Halli Hallo".find("Hallo") # returns first idx: 6, if not found: -1
combine two strings into 1 "Halli " + "Hallo"

Concatenate arguments 1 by 1 as characters: " ".join(["Halli", "Hallo"])

Import package: import math

Operations on number
Absolute value abs(x)
round up to next int math.ceil(x)
round down to next int math.floor(x)
Exponent x**2
Modulus / remainder 10 % 3 # 1
Integer division 10 // 3 # 3

Types

Convert types
a = int(b)
n = float(m)
y = complex(x)
Get type of variable
type(x)

Dataframes

Contrary to arrays, the different columns of data frames can contain different data types.

import package: import pandas as pd

Construct data frame
df = pd.DataFrame(
  {"col1" : [1,2,3],
  "col2" : [12.4, 15.6, 16.9],
  "col3" : ["green", "blue", "white"]}
)
#    col1  col2   col3
# 0     1  12.4  green
# 1     2  15.6   blue
# 2     3  16.9  white
Functions on dataframes
Rename columns df.columns = ["rank", "result", "team"]
Get summary statistics on columns df.describe()
Access column df.iloc[:,1] or df["col1_name"] or df.col1_name
Add row, column df.loc[len(df)] = [4, 12.0, "black"], df["new_col"] = ["val1", "val2", "val3"]
Remove first row, column df.iloc[1:], df.iloc[:,1:]
Select row with max value of col1 df["col2"].idxmax()

Categorical values

You can store categorical values in pandas:

Create factor
df["col3"] = df["col3"].astype("category")

This will save memory and other python libraries will know that they should treat the column as categories.

Control structures

If … else …
if x < 5:
print("small")
elif x < 10:
print("medium")
else: 
print("big")
While loops
while(x < 10):
x += 1
print(x)
For loops
for x in x_list:
print(x)

Functions

Create function
def my_func(first_name, last_name = ""):
greeting = " ".join(["Hallo", first_name, last_name])
print(greeting)
return greeting
Call function:
my_func("Donald", "Duck")